Goto

Collaborating Authors

 interarrival time


Bayesian Decentralized Decision-making for Multi-Robot Systems: Sample-efficient Estimation of Event Rates

arXiv.org Artificial Intelligence

Abstract-- Effective collective decision-making in swarm robotics often requires balancing exploration, communication and individual uncertainty estimation, especially in hazardous environments where direct measurements are limited or costly. We propose a decentralized Bayesian framework that enables a swarm of simple robots to identify the safer of two areas, each characterized by an unknown rate of hazardous events governed by a Poisson process. Robots employ a conjugate prior to gradually predict the times between events and derive confidence estimates to adapt their behavior . Our simulation results show that the robot swarm consistently chooses the correct area while reducing exposure to hazardous events by being sample-efficient. Compared to baseline heuristics, our proposed approach shows better performance in terms of safety and speed of convergence. The proposed scenario has potential to extend the current set of benchmarks in collective decision-making and our method has applications in adaptive risk-aware sampling and exploration in hazardous, dynamic environments. Collective decision-making under uncertainty is a fundamental challenge in multi-robot systems, including domains such as collective perception, environment classification, and spatial consensus [1]-[4]. Decentralized systems (e.g., robot swarms) operate under strict limitations on sensing, communication, and memory. Instead of sharing/storing complete observation histories, robots must maintain compact model representations of their knowledge. It is crucial to develop efficient strategies for collective decision-making, especially when observations are sparse, noisy [5], and gathered from stochastic processes [6]. This is typically characterized as a best-of-n problem [3], [7].


Design and Scheduling of an AI-based Queueing System

arXiv.org Artificial Intelligence

To leverage prediction models to make optimal scheduling decisions in service systems, we must understand how predictive errors impact congestion due to externalities on the delay of other jobs. Motivated by applications where prediction models interact with human servers (e.g., content moderation), we consider a large queueing system comprising of many single server queues where the class of a job is estimated using a prediction model. By characterizing the impact of mispredictions on congestion cost in heavy traffic, we design an index-based policy that incorporates the predicted class information in a near-optimal manner. Our theoretical results guide the design of predictive models by providing a simple model selection procedure with downstream queueing performance as a central concern, and offer novel insights on how to design queueing systems with AI-based triage. We illustrate our framework on a content moderation task based on real online comments, where we construct toxicity classifiers by finetuning large language models.


Inference and Sampling of Point Processes from Diffusion Excursions

arXiv.org Machine Learning

Point processes often have a natural interpretation with respect to a continuous process. We propose a point process construction that describes arrival time observations in terms of the state of a latent diffusion process. In this framework, we relate the return times of a diffusion in a continuous path space to new arrivals of the point process. This leads to a continuous sample path that is used to describe the underlying mechanism generating the arrival distribution. These models arise in many disciplines, such as financial settings where actions in a market are determined by a hidden continuous price or in neuroscience where a latent stimulus generates spike trains. Based on the developments in It\^o's excursion theory, we propose methods for inferring and sampling from the point process derived from the latent diffusion process. We illustrate the approach with numerical examples using both simulated and real data. The proposed methods and framework provide a basis for interpreting point processes through the lens of diffusions.


A Generative Approach for Production-Aware Industrial Network Traffic Modeling

arXiv.org Artificial Intelligence

The new wave of digitization induced by Industry 4.0 calls for ubiquitous and reliable connectivity to perform and automate industrial operations. 5G networks can afford the extreme requirements of heterogeneous vertical applications, but the lack of real data and realistic traffic statistics poses many challenges for the optimization and configuration of the network for industrial environments. In this paper, we investigate the network traffic data generated from a laser cutting machine deployed in a Trumpf factory in Germany. We analyze the traffic statistics, capture the dependencies between the internal states of the machine, and model the network traffic as a production state dependent stochastic process. The two-step model is proposed as follows: first, we model the production process as a multi-state semi-Markov process, then we learn the conditional distributions of the production state dependent packet interarrival time and packet size with generative models. We compare the performance of various generative models including variational autoencoder (VAE), conditional variational autoencoder (CVAE), and generative adversarial network (GAN). The numerical results show a good approximation of the traffic arrival statistics depending on the production state. Among all generative models, CVAE provides in general the best performance in terms of the smallest Kullback-Leibler divergence.


Mitigating Performance Saturation in Neural Marked Point Processes: Architectures and Loss Functions

arXiv.org Artificial Intelligence

Attributed event sequences are commonly encountered in practice. A recent research line focuses on incorporating neural networks with the statistical model -- marked point processes, which is the conventional tool for dealing with attributed event sequences. Neural marked point processes possess good interpretability of probabilistic models as well as the representational power of neural networks. However, we find that performance of neural marked point processes is not always increasing as the network architecture becomes more complicated and larger, which is what we call the performance saturation phenomenon. This is due to the fact that the generalization error of neural marked point processes is determined by both the network representational ability and the model specification at the same time. Therefore we can draw two major conclusions: first, simple network structures can perform no worse than complicated ones for some cases; second, using a proper probabilistic assumption is as equally, if not more, important as improving the complexity of the network. Based on this observation, we propose a simple graph-based network structure called GCHP, which utilizes only graph convolutional layers, thus it can be easily accelerated by the parallel mechanism. We directly consider the distribution of interarrival times instead of imposing a specific assumption on the conditional intensity function, and propose to use a likelihood ratio loss with a moment matching mechanism for optimization and model selection. Experimental results show that GCHP can significantly reduce training time and the likelihood ratio loss with interarrival time probability assumptions can greatly improve the model performance.


Approximate Tolerance and Prediction in Non-normal Models with Application to Clinical Trial Recruitment and End-of-study Success

arXiv.org Machine Learning

A prediction interval covers a future observation from a random process in repeated sampling, and is typically constructed by identifying a pivotal quantity that is also an ancillary statistic. Outside of normality it can sometimes be challenging to identify an ancillary pivotal quantity without assuming some of the model parameters are known. A common solution is to identify an appropriate transformation of the data that yields normally distributed observations, or to treat model parameters as random variables and construct a Bayesian predictive distribution. Analogously, a tolerance interval covers a population percentile in repeated sampling and poses similar challenges outside of normality. The approach we consider leverages a link function that results in a pivotal quantity that is approximately normally distributed and produces tolerance and prediction intervals that work well for non-normal models where identifying an exact pivotal quantity may be intractable. This is the approach we explore when modeling recruitment interarrival time in clinical trials, and ultimately, time to complete recruitment.


An online learning approach to dynamic pricing and capacity sizing in service systems

arXiv.org Machine Learning

We study a dynamic pricing and capacity sizing problem in a GI/GI/1 queue, where the service provider's objective is to obtain the optimal service fee $p$ and service capacity $\mu$ so as to maximize cumulative expected profit (the service revenue minus the staffing cost and delay penalty). Due to the complex nature of the queueing dynamics, such a problem has no analytic solution so that previous research often resorts to heavy-traffic analysis in that both the arrival rate and service rate are sent to infinity. In this work we propose an online learning framework designed for solving this problem which does not require the system's scale to increase. Our algorithm organizes the time horizon into successive operational cycles and prescribes an efficient procedure to obtain improved pricing and staffing policies in each cycle using data collected in previous cycles. Data here include the number of customer arrivals, waiting times, and the server's busy times. The ingenuity of this approach lies in its online nature, which allows the service provider do better by interacting with the environment. Effectiveness of our online learning algorithm is substantiated by (i) theoretical results including the algorithm convergence and regret analysis (with a logarithmic regret bound), and (ii) engineering confirmation via simulation experiments of a variety of representative GI/GI/1 queues.


Universal Approximation with Neural Intensity Point Processes

arXiv.org Machine Learning

We propose a class of neural network models that universally approximate any point process intensity function. Our model can be easily applied to a wide variety of applications where the distribution of event times is of interest, such as, earthquake aftershocks, social media events, and financial transactions. Point processes have long been used to model these events, but more recently, neural network point process models have been developed to provide further flexibility. However, the theoretical foundations of these neural point processes are not well understood. We propose a neural network point process model which uses the summation of basis functions and the function composition of a transfer function to define point process intensity functions. In contrast to prior work, we prove that our model has universal approximation properties in the limit of infinite basis functions. We demonstrate how to use positive monotonic Lipschitz continuous transfer functions to shift universal approximation from the class of real valued continuous functions to the class of point process intensity functions. To this end, the Stone-Weierstrass Theorem is used to provide sufficient conditions for the sum of basis functions to achieve point process universal approximation. We further extend the notion of universal approximation mentioned in prior work for neural point processes to account for the approximation of sequences, instead of just single events. Using these insights, we design and implement a novel neural point process model that achieves strong empirical results on synthetic and real world datasets; outperforming state-of-the-art neural point process on all but one real world dataset.